On power law distributions in large-scale taxonomies
Similar resources
Comparative Classifier Evaluation for Web-Scale Taxonomies Using Power Law
In the context of web-scale taxonomies such as Mozilla and Yahoo! directories, previous works have shown the existence of power law distribution in the size of the categories for every level in the taxonomy. In this work, we analyse how such high-level semantics can be leveraged to evaluate accuracy of hierarchical classifiers which automatically assign the unseen documents to leaf-level cate...
Sampling power-law distributions
Power-law distributions describe many phenomena related to rock fracture. Data collected to measure the parameters of such distributions only represent samples from some underlying population. Without proper consideration of the scale and size limitations of such data, estimates of the population parameters, particularly the exponent D, are likely to be biased. A Monte Carlo simulation of the s...
Power-law citation distributions are not scale-free
We analyze time evolution of statistical distributions of citations to scientific papers published in the same year. While these distributions seem to follow the power-law dependence we find that they are nonstationary and the exponent of the power-law fit decreases with time and does not come to saturation. We attribute the nonstationarity of citation distributions to different longevity of th...
On Flat versus Hierarchical Classification in Large-Scale Taxonomies
We study in this paper flat and hierarchical classification strategies in the context of large-scale taxonomies. To this end, we first propose a multiclass, hierarchical data dependent bound on the generalization error of classifiers deployed in large-scale taxonomies. This bound provides an explanation to several empirical results reported in the literature, related to the performance of flat ...
Upper-truncated Power Law Distributions
Power law cumulative number-size distributions are widely used to describe the scaling properties of data sets and to establish scale invariance. We derive the relationships between the scaling exponents of non-cumulative and cumulative number-size distributions for linearly binned and logarithmically binned data. Cumulative number-size distributions for data sets of many natural phenomena exhi...
Journal
Journal title: ACM SIGKDD Explorations Newsletter
Year: 2014
ISSN: 1931-0145,1931-0153
DOI: 10.1145/2674026.2674033